Discrete representation learning for handwritten text recognition
Abstract
Handwritten text recognition, i.e., the conversion of scanned handwritten documents into machine-readable text, is a complex exercise due to the variability and complexity of handwriting. A common approach in handwritten text recognition consists of a feature extraction step followed by a recognizer. In this paper, we propose a novel DNN architecture for handwritten text recognition that extracts a discrete representation from the input text-line image. The proposed model is constructed of an encoder-decoder network with an added quantization layer, which applies a dictionary of representative vectors to discretize the latent variables. The dictionary and network parameters are trained jointly through the k-means algorithm and back propagation, respectively. The performance of the suggested model is evaluated by conducting extensive experiments on five datasets, analyzing the effect of discrete representation on handwriting recognition. The results demonstrate that the use of discretization improves the performance of deep handwritten text recognition models when compared to conventional continuous representation. Specifically, the character error rate is decreased by 22% and 21.1% on the IAM and ICFHR18 datasets, respectively.
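The quantization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`quantize`, `kmeans_update`), the plain-list data layout, and the toy values are assumptions made for illustration, and the encoder, decoder, and back-propagation parts of the model are omitted. The sketch only shows how a dictionary of representative vectors can discretize latent vectors by nearest-neighbor lookup, and how one k-means step can move each dictionary entry toward the latents assigned to it.

```python
from typing import List, Tuple

Vector = List[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents: List[Vector], codebook: List[Vector]) -> Tuple[List[int], List[Vector]]:
    """Replace each latent vector by its nearest codebook entry.

    Returns the chosen codebook indices (the discrete representation)
    and the corresponding quantized vectors.
    """
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        for z in latents
    ]
    return indices, [list(codebook[k]) for k in indices]


def kmeans_update(latents: List[Vector], indices: List[int], codebook: List[Vector]) -> List[Vector]:
    """One k-means step: move each codeword to the mean of its assigned latents.

    Codewords with no assigned latent are left unchanged (an assumption of
    this sketch; other strategies exist).
    """
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for z, k in zip(latents, indices):
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += z[d]
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else list(codebook[k])
        for k in range(len(codebook))
    ]


if __name__ == "__main__":
    # Toy latent vectors standing in for encoder outputs (hypothetical values).
    latents = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0]]
    codebook = [[0.0, 0.1], [0.9, 0.9]]
    indices, quantized = quantize(latents, codebook)
    print(indices)
    print(kmeans_update(latents, indices, codebook))
```

In a full model, the indices returned by `quantize` would serve as the discrete representation passed on to the recognizer, while the dictionary update would alternate with gradient updates of the network parameters.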
Similar resources
Active Learning for Historic Handwritten Text Recognition
This thesis examines the use of active learning for the task of handwritten text recognition in historical documents. Active learning is a machine learning paradigm which enables the learner to select the data that is being trained on. In domains where procuring annotated data is expensive but there are large amounts of unlabelled data available, active learning can lead to better models with t...
Sentence Boundary Detection for Handwritten Text Recognition
In the larger context of handwritten text recognition systems many natural language processing techniques can potentially be applied to the output of such systems. However, these techniques often assume that the input is segmented into meaningful units, such as sentences. This paper investigates the use of hidden-event language models and a maximum entropy based method for sentence boundary det...
Self-training for Handwritten Text Line Recognition
Off-line handwriting recognition deals with the task of automatically recognizing handwritten text from images, for example from scanned sheets of paper. Due to the tremendous variations of writing styles encountered between different individuals, this is a very challenging task. Traditionally, a recognition system is trained by using a large corpus of handwritten text that has to be transcribe...
Handwritten Text Recognition for Ancient Documents
Huge amounts of legacy documents are being published by on-line digital libraries worldwide. However, for these raw digital images to be really useful, they need to be transcribed into a textual electronic format that would allow unrestricted indexing, browsing and querying. In some cases, adequate transcriptions of the handwritten text images are already available. In this work three systems ...
Handwritten Text Recognition for Historical Documents
The amount of digitized legacy documents has been rising dramatically in recent years, mainly due to the increasing number of on-line digital libraries publishing this kind of document. The vast majority of them remain waiting to be transcribed into a textual electronic format (such as ASCII or PDF) that would provide historians and other researchers new ways of indexing, consulting and que...
Journal
Journal title: Neural Computing and Applications
Year: 2023
ISSN: 0941-0643, 1433-3058
DOI: https://doi.org/10.1007/s00521-023-08445-9